SPECTRAL CALIBRATION OF FLUORESCENT
POLYNUCLEOTIDE SEPARATION APPARATUS



Field of the Invention 

The invention is in the field of spectral calibration of fluorescence-based automated
polynucleotide length measurement instruments.

Background

Spectral calibration is the estimation of reference spectral profiles (reference spectra) of
particular fluorescent dyes using the optical measurement system of an automated DNA
sequencer or similar fluorescent polynucleotide separation apparatus in which the particular dyes
will be used. The current practice of spectral calibration relies on measuring the spectral
profile of each fluorescent dye separately. This approach reduces throughput because, for N dyes,
it requires N lanes on gel-based instruments and N separate runs on capillary-based instruments.
As more fluorescent dyes are developed and used routinely (that is, as N increases), the spectral
calibration of fluorescent polynucleotide separation apparatus becomes more demanding and less
efficient under the current practice. Additionally, the amount of computer resources devoted to
spectral calibration increases with the number of dyes and separation channels analyzed.



Summary

The invention relates to methods, compositions, and systems for calibrating a
fluorescent polynucleotide separation apparatus. Fluorescent polynucleotide separation
apparatus, such as automated DNA sequencers, must be spectrally calibrated for use with the
different fluorescent dyes to be used in conjunction with the separation system.

One aspect of the invention is multiple color calibration standards and their use. A
multiple color calibration standard is a mixture of at least two polynucleotides of different length,
wherein each of the polynucleotides is labeled with a spectrally distinct fluorescent dye. In
preferred embodiments of the invention, the multiple color calibration standard comprises at least
four polynucleotides of different length, and each of the polynucleotides is labeled with a spectrally
distinct dye. The invention includes numerous methods of spectrally calibrating a fluorescent
polynucleotide separation apparatus with a multiple color calibration standard.

Another aspect of the invention is to produce total emission temporal profiles of
multiple color calibration standards for use in calibrating fluorescent polynucleotide separation
apparatus. A total emission temporal profile is a sum of the intensities of the fluorescence signal
obtained in all spectral channels as a function of time. The peaks corresponding to the
fluorescently labeled polynucleotides in the total emission temporal profile may be detected using
a peak detector that is driven by changes in the slope of the total emission temporal profile.
Calibration of fluorescent polynucleotide separation apparatus with various embodiments of the
methods of the invention includes the step of identifying the labeled polynucleotides of the
multiple color calibration standards. The process of spectral calibration of fluorescent
polynucleotide separation apparatus using a multiple color calibration standard may include the
step of estimating (extracting) the dyes' reference spectra, using information from the
peak detection process performed on the total emission temporal profile.

Other aspects of the invention include systems for separating and detecting
fluorescently labeled polynucleotides, wherein the system is designed for spectral calibration in
accordance with the subject calibration methods employing multiple color calibration standards.
The subject systems comprise a fluorescent polynucleotide separation apparatus and a computer
in functional combination with the apparatus.

Another aspect of the invention is methods and compositions for detecting the flow of
electrical current through a separation channel of a fluorescent polynucleotide separation
apparatus. These methods and compositions employ monitoring dyes. Monitoring dyes are
fluorescent dyes that are spectrally distinct from the dyes on the polynucleotides intended to
convey genetic information, e.g., fluorescent polynucleotide sequencing reaction products.

Brief Description of the Drawings

Figure 1 is a diagram of an example of a portion of a temporal profile, labeled to show
examples of some of the terms used herein.

Definitions

The term "fluorescent polynucleotide separation apparatus" as used herein denotes
an apparatus for separating fluorescently labeled polynucleotide mixtures (e.g., by
electrophoresis) and detecting the separated polynucleotides by the fluorescence emission
produced from exciting the fluorescent dye. Examples of fluorescent polynucleotide separation
apparatus include automated DNA sequencers such as the PE Applied Biosystems 310 and 377
(Foster City, California). Examples of fluorescent polynucleotide separation apparatus are also
described in, among other places, U.S. Patent Nos. 4,971,677; 5,062,942; 5,213,673; 5,277,780;
5,307,148; 4,811,218; and 5,274,240. The term fluorescent polynucleotide separation apparatus
also includes similar instruments for polynucleotide fragment length analysis that are not capable
of the single base pair resolution required to obtain DNA base sequence information. Fluorescent
polynucleotide separation apparatus comprise one or more separation regions or channels,
typically the path of electric current flow in electrophoretic separation devices. Types of
separation channels include capillaries, microchannels, tubes, slab gels, and the like. Fluorescent
polynucleotide separation apparatus collect several types of data during their operation. This data
includes spectral data and temporal data relating to the fluorescently labeled polynucleotides
separated by the apparatus. Typically, such data is collected by a detector (e.g., a CCD array,
photomultiplier tubes, and the like) designed to obtain quantitative spectral data over a
predetermined region or regions of the separation channels. Spectral data collected by the
apparatus includes the intensity of fluorescence at a plurality of wavelengths. The different
wavelengths sampled are referred to as bins or channels. The apparatus also collects temporal
data that is correlated with the spectral data. The temporal data is collected at numerous different
time points. For example, a detector at a fixed position will measure increases and decreases in
fluorescence intensity as a function of time as a labeled polynucleotide peak passes by the
detector. This temporal data may be expressed as "frame" or "scan" number to indicate the
different temporal sampling points.

A temporal profile is a plot of the intensity of a spectral signal as a function of time
or scan/frame number. A temporal profile consists of systematic and random variations.
Systematic variations are caused by peaks, spikes and background drifts. These variations cause
the shape of the profile to undergo specific, and often predictable, changes. By contrast, random
variations do not cause specific or predictable changes in the temporal profile. A temporal profile
has segments that correspond to baseline (baseline segments), segments that correspond to
peaks (peak segments), and segments that correspond to spikes. Baseline segments are made of
random variations superimposed on offset value(s).

An emission temporal profile is a plot of the intensity of the signals obtained in a
certain spectral channel/bin as a function of time or scan/frame number.

A total emission temporal profile is a plot of the sum of the intensities of the signals
obtained in all spectral channels/bins as a function of time or scan/frame number.

The analytical background of a temporal profile is the average of the signals obtained
along a segment of the profile where the segment is void of peaks, spikes and systematic
variations (i.e., a baseline segment). This is schematically shown in Figure 1. The analytical
noise of a temporal profile is the standard deviation of the signals obtained along a segment of
the profile where the segment is void of peaks, spikes and systematic variations. Analytical
background and noise may change as a function of time along the temporal profile. This occurs
when there are drifts in the background.

The term net analytical signal refers to the intensity at any point of a profile after
correcting for background and baseline offsets and/or drifts. The analytical signal to noise ratio
(S/N) is the ratio of the net analytical signal to the analytical noise. Net analytical signals may,
or may not, be significant depending on their S/N ratios.
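The definitions of analytical background, analytical noise and S/N can be illustrated with a short sketch. The function names and the sample profile are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text only says "standard deviation".

```python
from statistics import mean, stdev


def analytical_background_and_noise(profile, baseline_slice):
    """Return (background, noise) estimated from a baseline segment.

    background: average of the signals along the baseline segment.
    noise: standard deviation along the same segment (sample form; an assumption).
    """
    segment = profile[baseline_slice]
    return mean(segment), stdev(segment)


def signal_to_noise(profile, index, background, noise):
    """Net analytical signal at `index` divided by the analytical noise."""
    return (profile[index] - background) / noise


# Hypothetical temporal profile: eight baseline points followed by one peak.
profile = [10.0, 11.0, 9.0, 10.0, 11.0, 9.0, 10.0, 10.0, 30.0, 60.0, 30.0, 10.0]
background, noise = analytical_background_and_noise(profile, slice(0, 8))
peak_sn = signal_to_noise(profile, 9, background, noise)
```

Here the baseline segment is chosen by hand; in practice it would come from the peak detector's classification of baseline and peak segments.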

A peak detector is a mathematical transformation of a profile (e.g., a temporal profile)
whose purpose is to locate peaks along the profile. A peak detector is defined by the type of the
transformation, and the detection parameters associated with its operation. A typical peak
detector distinguishes between segments of a profile that represent baseline (an offset with
random noise) and other segments that represent peaks and spikes based on the slope of the
temporal profile. From the peak detector's point of view, a baseline segment is a set of data
points along the temporal profile where the absolute value of the slope of the profile does not
exceed the peak detector's threshold. An ideal peak detector ignores baseline and spike segments,
and retains information relevant only to peaks (in our case the component polynucleotides of the
multiple color calibration standard).

Peak slope threshold is a value which, if exceeded by the slope of a temporal profile,
indicates the presence of a potential peak. This value may be referred to as the "threshold"
parameter of the peak detector. If a peak is actually present, the threshold value is also used to
indicate that the temporal profile has returned to baseline levels and that the peak has ended.

Peak start is the first point along the peak segment of a temporal profile. A peak start
may be found at baseline levels, or in the valley between two peaks. Peak end is the last point
along the peak segment of a temporal profile. A peak end may be found at baseline levels, or in
the valley between two peaks. Peak maximum is a point along the peak segment of a profile
where the highest intensity is found. Peak width is the number of data points between the start
of the peak and the end of the peak (see Figure 1). The peak width attribute is helpful in
discriminating between peaks that correspond to labeled DNA fragments and spikes. The latter
have relatively smaller peak widths.

Peak height at maximum is the intensity at peak maximum corrected for the analytical
background (see Figure 1). Peak S/N ratio refers to the ratio of the peak height at maximum to
the analytical noise of the temporal profile. A peak's S/N attribute is an effective parameter that
is used to retain the peak information of the dye-labeled fragments of the multiple color
calibration standard.
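As a minimal sketch of how the peak width and peak S/N attributes could be used together, the following filter keeps only peaks that meet both minimum requirements. The dictionary layout, the function name and the numeric limits are hypothetical and are chosen only for illustration.

```python
def retain_significant_peaks(peaks, min_width, min_sn):
    """Keep peaks that meet both the minimum width and the minimum S/N."""
    return [
        p for p in peaks
        if p["width"] >= min_width and p["sn"] >= min_sn
    ]


# Hypothetical detected peaks: width in data points, S/N as defined above.
detected = [
    {"name": "spike", "width": 2, "sn": 40.0},      # narrow: rejected as a spike
    {"name": "fragment", "width": 12, "sn": 55.0},  # wide and strong: retained
    {"name": "weak", "width": 11, "sn": 3.0},       # wide but insignificant: rejected
]
kept = retain_significant_peaks(detected, min_width=8, min_sn=10.0)
```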

Migration time of a peak is the time elapsed from the start of the electrophoresis to peak
maximum. A particular peak corresponding to a certain labeled polynucleotide of the multiple
color calibration standard may serve as a reference peak whose migration time is a reference
point from which the migration times of other peaks are measured.

Migration time offset is the difference between the migration time of a particular peak
and the migration time of the reference peak (see Figure 1). Peaks to the left of the reference peak
will have negative migration time offsets, while those to the right of the reference peak will have
positive migration time offsets. Reference peaks are located based on rank or migration time.
Subsequently, migration time offsets are used to locate all other dye-labeled fragments.
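The use of a reference peak and migration time offsets can be sketched as follows. This is an illustration under stated assumptions rather than the method of the invention: the reference peak is located by rank (here, the last retained peak), the dye names, offsets and tolerance are hypothetical, and the nearest-peak matching rule is an assumption.

```python
def assign_fragments(peak_times, expected_offsets, tolerance):
    """Map each dye to the detected peak nearest to (reference + offset).

    peak_times: migration times of the retained peaks, in ascending order.
    expected_offsets: dye name -> migration time offset from the reference
        peak (negative means the peak lies to the left of the reference).
    tolerance: largest accepted deviation from the expected position.
    A dye with no peak inside the tolerance is mapped to None.
    """
    reference = peak_times[-1]  # reference peak located by rank: the last peak
    assignments = {}
    for dye, offset in expected_offsets.items():
        target = reference + offset
        nearest = min(peak_times, key=lambda t: abs(t - target))
        assignments[dye] = nearest if abs(nearest - target) <= tolerance else None
    return assignments


# Hypothetical retained peaks; 171.0 is a non-target peak that matches no offset.
peaks = [100.0, 152.0, 171.0, 249.0, 300.0]
offsets = {"dye_A": -200.0, "dye_B": -150.0, "dye_C": -50.0, "dye_D": 0.0}
result = assign_fragments(peaks, offsets, tolerance=5.0)
```

The tolerance plays the role of the "appropriate tolerances" that account for instrumental and experimental variations.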

Input parameters are attributes that are used by a particular implementation of the
algorithm. These parameters may be specific to the multiple color calibration standard as well
as to the platform being used. The implementation attributes may include the peak width, the
threshold variable, the peak S/N ratio, the reference peak locator (migration time vs. rank), the
migration time offsets, and the appropriate tolerances, if necessary, to account for instrumental
and experimental variations.

The term "polynucleotide" as used herein refers to naturally occurring polynucleotides
such as DNA and RNA and to synthetic analogs of naturally occurring DNA, e.g.,
phosphorothioates, phosphoramidates, peptide nucleic acids (PNAs), and the like. The term
"polynucleotide" does not convey any length limitation and should be read to include in vitro
synthesized oligonucleotides.

Specific Embodiments of the Invention

The invention relates to methods, compositions, and systems for calibrating a
fluorescent polynucleotide separation apparatus. Fluorescent polynucleotide separation
apparatus, such as automated DNA sequencers, must be spectrally calibrated for use with the
different fluorescent dyes to be used in conjunction with the separation system. Spectral
calibration may also be used to account for variations between individual fluorescent
polynucleotide separation apparatus and to account for changes that occur in a given instrument
over time. Fluorescent dyes have characteristic emission spectra for a given excitation
wavelength. When multiple different dyes are present in a mixture for separation, the individual
contributions of the different dyes to a spectral detection reading must be separated from one
another. Such separation may be achieved through the use of a matrix containing spectral
emission data of the various dyes used for analysis, see Yin et al., Electrophoresis 17:1143-1150
(1996) and U.S. patent application 08/659,115, filed June 3, 1996. The generation of a spectral
calibration data matrix for calibrating a fluorescent polynucleotide separation apparatus typically
includes the steps of introducing a fluorescent polynucleotide calibration standard into a
fluorescent polynucleotide separation apparatus, separating the labeled polynucleotides from each
other, and detecting the separated polynucleotides with a detector. The detector collects spectral
information relating to the concentration of labeled polynucleotides at a specific location (or
locations) on the apparatus. The information collected is the fluorescent emissions at a plurality
of wavelengths (e.g., bins/channels). The information obtained by the detector includes the
recording of temporal data (e.g., scan number, for a fluorescent polynucleotide separation
apparatus that employs a scanning detector) correlated with the spectral emission data for the
measured time points.

One aspect of the invention is to produce total emission temporal profiles of multiple
color calibration standards for use in calibrating fluorescent polynucleotide separation apparatus.
A total emission temporal profile is a sum of the intensities of the fluorescence signal obtained
in all spectral channels as a function of time. Peaks corresponding to the different
oligonucleotides in the multiple color calibration standard may then be determined by analyzing
the total emission temporal profile with a peak detection transformation function. A reference
spectrum for each fluorescent dye of interest used in the multiple color calibration standard
may then be produced by selecting a reference spectrum that substantially corresponds to the
relevant peak of the total emission profile.

Other aspects of the invention are multiple color calibration standards and their use. A
multiple color calibration standard is a mixture of at least two polynucleotides of different length.
(It will be understood by persons skilled in the art that each polynucleotide is present in a large
number of essentially identical copies so as to provide useful amounts of the subject
compositions.) Preferably, the length (in number of bases) of each labeled polynucleotide is
known precisely so as to maximize the accuracy of the standard. Each of the different length
polynucleotides in the standard is labeled with a different fluorescent dye. The predetermined
correlation between the length of a given polynucleotide and the particular fluorescent dye that
is attached to that polynucleotide is used to identify the polynucleotides of the multiple color
calibration standard during the calibration process. The different fluorescent dyes are selected
so as to have distinctive spectral profiles (for the same excitation frequency). Preferably the sizes
of the polynucleotides in the multiple color calibration standard are selected so as to ensure
sufficient separation between the polynucleotides labeled with different dyes that the spectral
profile peaks of the fluorescent dyes do not significantly overlap. In other words, there is
preferably sufficient difference between the lengths of the constituent polynucleotides so that for
any given polynucleotide peak that is being detected, the possibility that the fluorescence
intensity readings are the result of multiple different dyes is minimal.

The sizes of the polynucleotides that are in multiple color calibration standards are
selected so as to be within the size separation range of the particular fluorescent polynucleotide
separation apparatus for which they are designed to be used. Exemplary of such a range is about
10 - 1500 bases in length, preferably about 10 - 1000 bases in length, more preferably about 20 -
500 bases in length. Preferably polynucleotides in the standard are separated by at least 10 bases
in length. Methods of making the polynucleotide components of the subject standards are well
known to persons of ordinary skill in the art. Such methods include the complete in vitro
synthesis of the polynucleotide, e.g., through the use of phosphoramidite chemistry.
Alternatively, the polynucleotides may be synthesized enzymatically. For example, a PCR
(polymerase chain reaction) amplification may be performed using primers separated by the
desired distance, wherein one of the amplification primers is labeled with a fluorescent dye of
interest.

In preferred embodiments of the invention, the multiple color calibration standard
comprises at least four polynucleotides of different length, and each of the polynucleotides is
labeled with a spectrally distinct dye. The use of four spectrally distinct dyes, each being
essentially the same as the dyes used for producing polynucleotide sequencing reaction products,
is of particular interest for use in four color chain termination type sequencing (employing either
fluorescently labeled chain terminating nucleotides or fluorescently labeled primers). The multiple
color calibration standard may comprise one or more fluorescent dyes in addition to the dyes in
the standard that correspond to the dyes used in sequencing reactions that are designed for use
in conjunction with the particular standard. These additional dyes may be "signal dyes" as
described later in this application. These additional dyes, which are preferably attached to
polynucleotides, may be used to monitor the electrical current flow through the separation
channel or channels of a fluorescent polynucleotide separation apparatus. While detection of
electrical current flow through a fluorescent polynucleotide separation apparatus without the use
of additional dyes is relatively simple for apparatus employing a single separation channel, e.g.,
a slab gel, the detection of current through a multi-channel system, e.g., a multiple capillary
system, is difficult without using additional dyes. The movement of these additional dyes, which
should also be added to the sample for analysis, through the fluorescent polynucleotide apparatus
may be detected in order to verify the flow of electrical current through a separation channel, e.g.,
an individual capillary.

The invention also includes kits for performing the subject methods. The kits comprise
the individual fluorescently labeled polynucleotide components of the subject multiple color
spectral calibration standards. By providing the individual components of a standard, end users
may conveniently produce their own standard for specific applications.

A wide variety of fluorescent dyes may be used to label the polynucleotides in multiple
color calibration standards. Fluorescent dyes are well known to those skilled in the art. Examples
of fluorescent dyes include fluorescein, 6-carboxyfluorescein, 2',4',5',7'-tetrachloro-4,7-dichlorofluorescein,
2',7'-dimethoxy-4',5'-6-carboxyrhodamine (JOE), N',N',N',N'-tetramethyl-
6-carboxyrhodamine (TAMRA) and 6-carboxy-X-rhodamine (ROX). Fluorescent dyes are
described in, among other places, U.S. patent 4,855,225; Menchen et al., U.S. patent 5,188,934;
Bergot et al., International Application PCT/US90/05565; Haugland, R.P., Handbook of
Fluorescent Probes and Research Chemicals, 6th edition (1996) and like references. Methods
of attaching fluorescent dyes to polynucleotides are also well
known to those skilled in the art. Examples of such attachment methods can be found in, among
other places, U.S. Patent Nos. 4,789,737; 4,876,335; 4,820,812; and 4,667,025.

The multiple color calibration standards of the invention may also comprise various
other components in addition to fluorescently labeled polynucleotides. Such additional
components may be used to improve the movement of the polynucleotides through a separation
channel of a fluorescent polynucleotide separation apparatus. Examples of additional
components include, but are not limited to, buffers, denaturants, and the like.

The invention includes numerous methods of spectrally calibrating a fluorescent
polynucleotide separation apparatus with a multiple color calibration standard. A multiple color
calibration standard is introduced, i.e., loaded, into a fluorescent polynucleotide separation
apparatus. The introduction of a multiple color calibration standard into a fluorescent
polynucleotide separation apparatus and the subsequent separation of the components of the
standard, along with the collection of the spectral and temporal data obtained from detecting the
separated labeled polynucleotides, may be conveniently referred to as producing a spectral
calibration run. Spectral calibration runs may be performed on a single separation
channel or may be performed simultaneously on several separation channels.

A spectral calibration run produces data that can conveniently be analyzed in the
form of a matrix, D, with R rows and C columns that contains the measured intensities in each
spectral channel/bin (the columns of the data matrix) as a function of time or frame/scan number
(the rows of the data matrix). Each of the C columns represents an emission temporal profile for
the corresponding spectral channel/bin. Each of the R rows represents the spectrum acquired
during the corresponding data collection/acquisition period. The person of skill in the art may devise
numerous equivalent representations of the data obtained from a calibration run rather than the
specific matrix described above, e.g., the components of the rows and columns may be transposed
or the data may be manipulated without the use of a 2-D matrix. Each temporal profile contains
peaks of different shapes that correspond to the dye-labeled polynucleotides of the multiple color
calibration standard. The shape of each of these peaks depends on the emission characteristics
of the corresponding dye at the specific spectral channel/bin represented by the temporal profile.
A total emission temporal profile may then be prepared by summing the intensities of the signals
obtained for all spectral channels/bins as a function of the temporal parameter, e.g., scan/frame
number. Ideally, the emission temporal profiles for the labeled polynucleotides of a multiple
color spectral calibration standard are "parallel." In practice, however, this ideal property may
show deviations that are caused by heterogeneous emission efficiencies, baseline drifts, minor
spectral measurement anomalies and deviations from the analytical linear dynamic range.
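The row and column conventions of the data matrix D can be illustrated with a small sketch. The nested-list representation, the function names and the numbers are hypothetical; any equivalent representation would serve.

```python
def total_emission_profile(data):
    """Sum the intensities over all spectral channels for each scan (row of D)."""
    return [sum(row) for row in data]


def emission_profile(data, channel):
    """Emission temporal profile of one spectral channel (one column of D)."""
    return [row[channel] for row in data]


# Hypothetical data matrix D with R = 3 scans (rows) and C = 3 channels (columns).
D = [
    [1.0, 2.0, 1.0],
    [4.0, 9.0, 3.0],
    [1.0, 2.0, 1.0],
]
total = total_emission_profile(D)
channel_1 = emission_profile(D, 1)
```

Each entry of `total` is the sum of one row of D, so the total emission temporal profile has one value per scan regardless of how many channels the detector samples.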

Despite sharing important general properties (peaks of the multiple color spectral calibration
standard's constituent labeled polynucleotides separated by baseline segments), the temporal
profiles of the individual spectral channels/bins may exhibit large variations in S/N ratios, noise
distribution and peak shapes. In order to minimize such problems, total emission temporal
profiles may be used for calibration rather than individual emission temporal profiles. An
advantage of total emission profiles is the inclusion of all polynucleotide components of the
standard regardless of differences in emission intensities between the spectral channels/bins. The
total emission profile thus provides a temporal profile that contains all the peaks of the labeled
polynucleotides of the multiple color spectral calibration standard, and only one set of detection
input parameters is necessary.

The peaks corresponding to the fluorescently labeled polynucleotides in the total
emission temporal profile may be detected using a peak detector that is driven by changes in the
slope of the total emission temporal profile. When the slope of the total emission temporal
profile exceeds a certain threshold, the start of a potential peak is detected. The potential peak
may then be traced through its crest/maximum until the potential peak ends, either by having
the total emission temporal profile return to background levels or by detecting the start of another
peak. The information regarding the start, maximum and end of the potential peak may then be
evaluated to assess the significance of the peak. Only significant peaks (in terms of the minimum
requirements indicated by the peak width and peak S/N ratio input parameters) are used to select
reference spectra. This process may be used to reject spikes and insignificant/non-target peaks
while retaining the peaks corresponding to the components of the multiple color calibration
standard.
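Once a significant peak has been located, selecting a reference spectrum can be sketched as reading the row of the data matrix at the peak maximum. The background subtraction and the normalization to a maximum of 1 shown here are assumptions made for illustration; the text does not prescribe either step, and the function name and data are hypothetical.

```python
def reference_spectrum(data, peak_index, baseline_rows):
    """Background-corrected, max-normalized spectrum at a peak maximum.

    data: list of rows (one per scan), each a list of channel intensities.
    peak_index: row at which the total emission profile peaks.
    baseline_rows: rows taken from a baseline segment near the peak.
    """
    n_channels = len(data[0])
    # Per-channel background: average over the chosen baseline rows.
    background = [
        sum(data[r][c] for r in baseline_rows) / len(baseline_rows)
        for c in range(n_channels)
    ]
    net = [data[peak_index][c] - background[c] for c in range(n_channels)]
    top = max(net)
    return [value / top for value in net]


# Hypothetical run: rows 0 and 1 are baseline, row 2 is the peak maximum.
data = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [5.0, 9.0, 3.0],
    [1.0, 1.0, 1.0],
]
spectrum = reference_spectrum(data, peak_index=2, baseline_rows=[0, 1])
```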



wo 00/16087 PCTAJS99i70836 

Peak Detection Transformation

Peak detection is performed on a total emission temporal profile. A preferred
transformation to detect peaks is the slope of the total emission temporal profile, and is given as:

$$S_i = (I_{i+1} - I_i) + (I_{i+2} - I_{i-1}) \qquad (1)$$

where $S_i$ is the slope (as estimated by the detection transformation) at point $i$, and $I_k$ is the
intensity of the total emission temporal profile at point $k$. However, other peak detection
transformations based on changes of intensity may also be used in the subject methods.
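A minimal sketch of the slope transformation follows, assuming Equation 1 has the form S_i = (I_{i+1} - I_i) + (I_{i+2} - I_{i-1}). The function name and the choice to mark end points as None, where the four-point stencil does not fit, are assumptions.

```python
def slope_transform(profile):
    """Apply S_i = (I[i+1] - I[i]) + (I[i+2] - I[i-1]) along a profile.

    Returns a list of the same length as `profile`. Points where the stencil
    does not fit (the first point and the last two points) are set to None.
    """
    n = len(profile)
    slopes = [None] * n
    for i in range(1, n - 2):
        slopes[i] = (profile[i + 1] - profile[i]) + (profile[i + 2] - profile[i - 1])
    return slopes


flat = slope_transform([5, 5, 5, 5, 5])  # constant baseline
ramp = slope_transform([0, 1, 2, 3, 4])  # steadily rising edge
```

On a constant baseline the transformation is zero, while a rising edge gives a positive value, which is the property a slope-driven peak detector relies on.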



Statistical Distribution of Detection Transformation and Failure Analysis

The threshold parameter used in a peak detector may be an actual value for the slope.
However, in a preferred embodiment of the invention the threshold is determined by the
distribution of the peak detection transformation based on a probabilistic model. An input
variable is used to estimate the threshold. The detection transformations produce a parameter,
for example S in Equation 1, that is used for peak detection. The performance of S in
distinguishing baseline segments from peak segments in a temporal profile is highly influenced
by the distribution of S when I is subjected to random variations only. The variance in S can be
estimated by applying error propagation theory to Equation 1, and is given according to:

$$\sigma^2(S) = \sum_k \left\{ \left[ \partial F(S) / \partial I_k \right]^2 \sigma^2(I_k) \right\}$$

where F(S) is the detection transformation (Equation 1). For independent measurements, the
above expression reduces to:

$$\sigma^2(S) = 4\sigma^2(I) \qquad (2)$$
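The reduction to Equation 2 can be written out term by term. This worked step assumes that Equation 1 has the form $S_i = (I_{i+1} - I_i) + (I_{i+2} - I_{i-1})$ and that every data point carries the same variance $\sigma^2(I)$.

```latex
\begin{align*}
\frac{\partial S_i}{\partial I_{i+1}} &= \frac{\partial S_i}{\partial I_{i+2}} = +1, &
\frac{\partial S_i}{\partial I_{i}} &= \frac{\partial S_i}{\partial I_{i-1}} = -1, \\
\sigma^2(S) &= \sum_{k=i-1}^{i+2} \left( \frac{\partial S_i}{\partial I_k} \right)^2 \sigma^2(I_k)
             = (1 + 1 + 1 + 1)\,\sigma^2(I) = 4\,\sigma^2(I).
\end{align*}
```

Each of the four intensities enters the transformation with a coefficient of magnitude one, so the standard deviation of S is twice that of I.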



Thus, segments of a temporal profile that correspond to baselines with random variations are
expected to produce amplified variations, according to Equation 2, after the detection
transformation.

The start of a peak is considered the first data point along the peak segment of the total
emission profile that does not belong in the baseline population. The baseline segment's
population produces a transformation distribution with a variance of $4\sigma^2(I)$ (Equation 2). The
S distribution's variance can, therefore, be used to set a detection threshold with a probability of
failure (incorrectly classifying a data point from the baseline population as the start of a peak
segment) that is given as

$$\Pr\left[\, |S_i - \mu(S)| \geq k\sigma(S) \,\right] \leq k^{-2} \qquad (3)$$

where $\mu(S)$ is the mean of the S distribution, and is expected to be zero.

For example, if the threshold is set at $3\sigma(S)$, the probability of selecting a data point
from the baseline segment's population as a peak start is, according to Equation 3, at most
100/9 percent, or about 11%. (Equation 3 does not assume a Gaussian, or any other, distribution
of the baseline data point population.) To decrease the probability of failure, the threshold may
be increased, or the peak start may be defined as two consecutive data points whose
transformation exceeds the threshold value. If the threshold is set, again, at $3\sigma(S)$, the
probability of $S_i$ exceeding this value at two consecutive measurements when only random
variations are present is about 1%. The peaks corresponding to the labeled polynucleotides of
the multiple color calibration standards are expected to be among the peaks with the highest
peak S/N ratios. Since all detected peaks may be subjected to additional criteria such as
minimum peak S/N ratio and minimum peak width, false peak starts (detected with a probability
of 1% as outlined above) are not expected to cause any significant problems in detecting and
retaining the peaks corresponding to the labeled polynucleotides of the multiple color
calibration standards while rejecting spikes and other non-target peaks.

The outcome of the peak detection process is a set of attributes for all peaks that satisfy
the minimum peak width and the minimum peak S/N ratio requirements. This information
includes the data point at the start of the peak and the data point at the end of the peak. Appropriate
descriptors indicating whether the peak start point is at baseline levels or in a valley between two
peaks are also compiled during the peak detection process. Similarly, peak end points are flagged
as either being at baseline levels or in a valley between two peaks. Peak information also
includes the data point at which the peak maximizes, and the intensity at the peak's maximum, as
well as the actual peak width. Where available, the locations of baseline segments to the left of
the peak start and to the right of the peak end may also be compiled.
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Identification of the components of Multiple Color Calibration Standards 

Calibration of fluorescent polynucleotide separation apparatus with various 
embodiments of the methods of the invention include the step of identification of the labeled 
polynucleotides of the multiple color calibration standards. The identification of the colored 
5 ladder firagments refers to the assignment of each labeled polynucleotide in a multiple color 
calibration standard to one of the peaks retained by the peak detector. Assignment can he 
accomplished by a variety of methods. Since the spectral calibration of fluorescent 
polynucleotide separation apparatus is accomplished under controlled conditions (known and 
prespecified materials and experimental parameters,) an efficient way to identify the labeled 

1 0 polynucleotides of the multiple color calibration standards is to take advantage of the controlled 
experimental conditions and the design of the colored ladder. For example, the multiple color 
spectral calibration standard design may be such that the fragment labeled with the dye DRl 1 0 
in a multiple color calibration standard has the largest migration time. Under optimized and 
controlled experimental conditions, where the peak width and peak S/N ratio parameters allow 

1 5 multiple color calibration standard constituent polynucleotides to be detected and retained, the 
last peak would be the DRl 10-labeled firagment. A peak with such a high probability of being 
detected may serve as a reference peak to locate peaks corresponding to the other labeled 
polynucleotides of the multiple color calibration standard. Since the migration of a labeled DNA 
fragment is influenced primarily by the size of the DNA fragment, the labeling dye and the 

20 separation matrix, migration time offsets over a short migration interval are effective parameters 
to use in locating the peaks corresponding to the labeled polynucleotides of the multiple color 
calibration standards given the location of a reference peak such as the DRl 10-labeled peak. 

If the mobilities of the labeled polynucleotides of the standard exhibit significant
nonlinearities, and the migration of the colored ladder fragments is not easily (and reliably)
predictable over a large range of migration times using offsets from one reference peak, the
prediction range may be reduced by relying on offsets from neighboring peaks. For example, a
polynucleotide labeled with DR110 may be used as a reference peak to locate the polynucleotide
(in the same multiple color calibration standard mixture) labeled with DR6G. Subsequently, the
polynucleotide labeled with DR6G (in the same standard) may serve as a reference peak to locate
the polynucleotide labeled with DTAM. The polynucleotide labeled with DTAM (in the same
standard) may then be used to locate the polynucleotide labeled with DROX. Finally, the
polynucleotide labeled with DROX (in the same standard) may serve as a reference peak to locate
the polynucleotide labeled with JAZ.
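
The chained search described above, in which each located peak becomes the reference for the next, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the representation of peaks by their migration times, and the rule of taking the nearest peak within a tolerance are all hypothetical choices, and any offsets supplied by a caller would have to come from the actual ladder design.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def nearest_within(times: Sequence[float], target: float, tol: float) -> Optional[float]:
    """Return the peak time closest to target, or None if none lies within tol."""
    candidates = [t for t in times if abs(t - target) <= tol]
    return min(candidates, key=lambda t: abs(t - target)) if candidates else None


def identify_chain(
    peak_times: Sequence[float],
    reference_dye: str,
    chain: List[Tuple[str, float, float]],
) -> Dict[str, Optional[float]]:
    """Assign dyes to peaks by chaining offsets from one located peak to the next.

    chain holds (dye, offset_from_previous_peak, tolerance) in search order.
    The reference peak is taken to be the last (slowest) retained peak.
    """
    if not peak_times:
        raise ValueError("no peaks retained")
    assigned: Dict[str, Optional[float]] = {reference_dye: max(peak_times)}
    previous = assigned[reference_dye]
    for dye, offset, tol in chain:
        found = None if previous is None else nearest_within(peak_times, previous + offset, tol)
        assigned[dye] = found
        previous = found  # a missing peak breaks the rest of the chain
    return assigned
```

Chaining keeps each prediction short-range, at the cost that one unlocated peak leaves every later dye in the chain unassigned; a fuller implementation might fall back to offsets from the original reference peak in that case.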



Peak Detection Parameters

The input parameters for peak detectors applied to the labeled polynucleotides of the multiple
color calibration standards may include, but are not limited to:

(a) The starting point and the sample size to be used in estimating the analytical
background and the analytical noise in the total emission temporal profile (σ(I) in Equation 2).
The analytical background and noise are used to assess the peak S/N ratio.

(b) The threshold variable corresponding to k in Equation 3. This determines the
sensitivity of the peak detector to baseline variations.

(c) The threshold variable to be used in detecting baseline segments to the left of peak
starting points and to the right of peak ending points, where available. Typically, this is a value
less than that used for detecting peak starting points.

(d) Minimum peak width and peak S/N ratio requirements. These two parameters are
selected such that spikes and non-target peaks are ignored. Ideally, only the peaks corresponding
to the fragments of the colored ladder are retained by the peak detector.

(e) Reference peak migration time and its tolerance. If this parameter is zero, the last
peak found is by default the reference peak.

(f) Migration time offsets of the colored ladder fragment peaks and their tolerances.

(g) The appropriate search windows for maxima and baseline values for the emission
temporal profiles.

(h) Number of the colored ladder fragment peaks and the maximum number of peaks
expected to be found in the total emission temporal profile. These parameters are used for
memory management.
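
Parameters (a) through (h) can be gathered into a single configuration record, with rule (e) expressed as a small function. The Python sketch below is only one way to organize them: every field name, the default values, and the `reference_peak` helper are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class PeakDetectorParams:
    baseline_start: int               # (a) first data point of the noise-estimation window
    baseline_size: int                # (a) number of data points in that window
    start_threshold_k: float          # (b) k in Equation 3
    baseline_threshold_k: float       # (c) typically smaller than start_threshold_k
    min_peak_width: int               # (d)
    min_peak_snr: float               # (d)
    reference_time: float = 0.0       # (e) zero selects the last peak found
    reference_tolerance: float = 0.0  # (e)
    offsets: List[float] = field(default_factory=list)            # (f)
    offset_tolerances: List[float] = field(default_factory=list)  # (f)
    search_window: int = 0            # (g) in data points
    n_ladder_peaks: int = 0           # (h)
    max_peaks: int = 0                # (h)


def reference_peak(peak_times: Sequence[float], p: PeakDetectorParams) -> float:
    """Apply rule (e): default to the last peak when reference_time is zero."""
    if not peak_times:
        raise ValueError("no peaks retained")
    if p.reference_time == 0:
        return max(peak_times)
    best = min(peak_times, key=lambda t: abs(t - p.reference_time))
    if abs(best - p.reference_time) > p.reference_tolerance:
        raise ValueError("no peak within the reference tolerance")
    return best
```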

Estimation of Dyes' Reference Spectra

The process of spectral calibration of fluorescent polynucleotide separation apparatus
using a multiple color calibration standard may include the step of estimating (extracting)
the dyes' reference spectra from the acquired data matrix, D, using information from the peak
detection process. As stated earlier, the rows of the data matrix, D, contain the spectral
information. Any spectrum acquired during any data collection/acquisition period can be
estimated from the net analytical signals obtained in the spectral channels/bins. A spectrum is,
thus, a background/baseline corrected row of D.

The dyes' reference spectra are, therefore, estimated from the corrected rows of D that
correspond to data points along the peak segments of the total emission temporal profile. The
peak maximum is the data point (row of D) recommended for estimating the dyes' reference
spectra. Since the emission temporal profiles of the individual spectral channels/bins are not
expected to be perfectly parallel, a row of D is corrected by estimating the net analytical signal
in each spectral channel/bin using the peak detection information from the total emission
temporal profile and appropriate search windows. Spectral calibration reference spectra are also
normalized such that the maximum spectral intensity in each spectrum is set to equal 1. This is
accomplished by dividing all corrected spectral intensities in each spectrum by the maximum
corrected spectral intensity found in the spectrum.
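
The extraction and normalization steps described above can be sketched in a few lines. The Python example below is a simplified illustration, not the specification's procedure: it assumes D is indexed as `D[t][c]` (data point, then spectral channel), estimates the baseline in each channel as the mean of caller-supplied baseline rows, and searches for each channel's maximum within a fixed window around the apex found in the total emission temporal profile. The function name and all parameters are hypothetical.

```python
from typing import List, Sequence


def reference_spectrum(
    D: Sequence[Sequence[float]],
    apex: int,
    baseline_rows: Sequence[int],
    window: int = 2,
) -> List[float]:
    """Estimate one dye's normalized reference spectrum from data matrix D.

    For each channel, the maximum is searched within +/- window data points of
    the apex, the mean of the baseline rows is subtracted to give the net
    analytical signal, and the result is scaled so that the largest value is 1.
    """
    n_channels = len(D[0])
    lo, hi = max(0, apex - window), min(len(D), apex + window + 1)
    net = []
    for c in range(n_channels):
        peak_value = max(D[t][c] for t in range(lo, hi))
        baseline = sum(D[t][c] for t in baseline_rows) / len(baseline_rows)
        net.append(peak_value - baseline)
    top = max(net)
    if top <= 0:
        raise ValueError("no positive net signal at this peak")
    return [v / top for v in net]
```

The per-channel search window is what accommodates channel profiles that are not perfectly parallel: each channel contributes its own local maximum rather than the value at a single shared row.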

Uncertainties in Dyes' Reference Spectra

The spectral intensity in a particular channel/bin of a normalized dye's reference
spectrum can be expressed as:

$$R_i = \frac{I_i}{I_h} \qquad (4)$$

where R_i is the normalized spectral intensity in the reference spectrum at the ith spectral
channel/bin,
I_i is the net analytical signal in the ith spectral channel/bin, and
I_h is the highest net analytical signal in the spectrum.
The uncertainty in R_i is given according to:

$$\frac{\sigma^2(R_i)}{R_i^2} = \frac{\sigma^2}{I_i^2}\left[1 + m^2\right] \qquad (5)$$

where m is given as I_i/I_h, and
σ² is the variance in the spectral intensities and is assumed to be equivalent
in both spectral channels/bins.
The relative error in R_i may be expressed according to:

$$\frac{\sigma(R_i)}{R_i} = \left[\frac{1}{\mathrm{SNR}_i}\right]\left[1 + m^2\right]^{1/2} \qquad (6)$$

where SNR_i is the signal-to-noise ratio of the net analytical signal in the ith spectral channel/bin.
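
A short numerical example may help make the relative-error expression concrete. The Python sketch below takes Equation 6 to be the standard first-order propagation of independent, equal-variance errors through the ratio I_i/I_h, that is, a relative error of sqrt(1 + m²)/SNR_i with m = I_i/I_h and SNR_i = I_i/σ. The function name and the sample intensities are hypothetical.

```python
import math


def relative_error(i_i: float, i_h: float, sigma: float) -> float:
    """Relative error of R_i = I_i / I_h for equal noise sigma in both bins."""
    if sigma <= 0 or i_i <= 0 or i_h < i_i:
        raise ValueError("require sigma > 0 and 0 < I_i <= I_h")
    m = i_i / i_h          # equals the normalized intensity R_i
    snr_i = i_i / sigma    # signal-to-noise ratio in the ith bin
    return math.sqrt(1.0 + m * m) / snr_i


# Hypothetical bin at 20% of the spectrum maximum, noise sigma = 5 counts.
weak = relative_error(200.0, 1000.0, 5.0)     # SNR_i = 40
strong = relative_error(800.0, 1000.0, 5.0)   # SNR_i = 160
print(f"weak bin: {weak:.2%}, strong bin: {strong:.2%}")
```

Since m cannot exceed 1, the factor sqrt(1 + m²) lies between 1 and sqrt(2), so the relative error in any bin is governed almost entirely by that bin's signal-to-noise ratio.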

The term [1 + m²] in Equations 5 and 6 never exceeds the value of 2 according to the
normalization defined by Equation 4. The relative error in R_i can, therefore, be expressed as:

$$\frac{\sigma(R_i)}{R_i} \le \left[\frac{1}{\mathrm{SNR}_i}\right]\sqrt{2} \qquad (7)$$

where SNR_i is the signal-to-noise ratio of the net analytical signal in the ith spectral channel/bin.

The analytical implication of Equation 6 (and Equation 7) is that the quality of the dyes'
reference spectra increases (i.e., the relative errors in the spectral bins decrease) as the
signal-to-noise ratio of the net analytical signal increases. The reliability of spectral estimation
is determined primarily by the signal-to-noise ratio, not by the number of spectra being used to
obtain an average estimate. Since the spectra acquired at peaks' maxima have the highest S/N
ratio, these spectra are the preferred spectra to be selected as reference spectra as they are
expected to have the lowest relative errors. However, other spectra that substantially correspond
to the peak maxima may also be used as reference spectra.

Other embodiments of the invention include systems for separating and detecting
fluorescently labeled polynucleotides, wherein the system is designed for spectral calibration in
accordance with the subject calibration methods employing multiple color calibration standards.
The subject systems comprise a fluorescent polynucleotide separation apparatus and a computer
in functional combination with the apparatus. The term "in functional combination" is used to
indicate that data from the fluorescent polynucleotide separation apparatus, such data including
fluorescence intensity data over a range of detection wavelengths and the associated temporal
data, is transferred to the computer in such a form that the computer may use the data for
calculation purposes. The computer in the system of the invention is programmed to perform the
spectral calibration method of the invention using the data produced from running a multiple
color spectral calibration standard. Thus the computer is programmed to produce a total emission
temporal profile from the spectral and temporal data obtained from the calibration run. The
computer may also be programmed to detect peaks in the total emission temporal profile, and
determine reference spectral profiles of the dyes attached to the labeled polynucleotides
represented by the peaks. A wide variety of computers may be used in the subject system.
Typically, the computer is a microprocessor and the attendant input, output, memory, and other
components required to perform the necessary calculations. The computers may be generally
programmable so as to facilitate modifications, or the computer program of the apparatus may be
in the form of "firmware" that is not readily subjected to modification.

Other embodiments of the invention include systems for calibrating a fluorescent
polynucleotide separation apparatus. The calibration systems include computer code that
receives a plurality of spectral and temporal data from a fluorescent polynucleotide separation
apparatus. The system also comprises computer code that calculates a total emission temporal
profile from the spectral and temporal data. The system may further comprise additional
computer code for performing the subject methods of spectral calibration. Such additional code
includes code for detecting peaks, and code for preparing a spectral profile for each of the dyes
included in a calibration standard. As the computer code of the subject system requires
a physical embodiment to function, the system also comprises a processor and a computer readable
medium (e.g., optical or magnetic storage medium) for storing the computer program code. The
computer readable medium is functionally coupled to the processor.
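
The code that "calculates a total emission temporal profile" can be very small. The Python sketch below assumes, as a working definition that is not spelled out in this part of the specification, that the total emission at each data point is the sum of the intensities over all spectral channels in the corresponding row of the data matrix D. The function name is hypothetical.

```python
from typing import List, Sequence


def total_emission_profile(D: Sequence[Sequence[float]]) -> List[float]:
    """Sum each row of D (one spectrum per data point) over all spectral channels."""
    if not D:
        return []
    width = len(D[0])
    if any(len(row) != width for row in D):
        raise ValueError("all rows of D must have the same number of channels")
    return [float(sum(row)) for row in D]
```

Summing over channels yields a single trace in which every dye contributes a peak, which is why one peak-detection pass over this profile can locate all of the labeled fragments.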

Another aspect of the invention is methods and compositions for detecting the flow of
electrical current through a separation channel of a fluorescent polynucleotide separation
apparatus. Such methods and compositions are particularly useful with fluorescent
polynucleotide separation apparatus that employ multiple separation channels, e.g., a
multi-capillary or multiple microchannel system, because interruptions in current flow in individual
separation channels may be difficult to detect if a substantial percentage of the channels have
proper current flow. The subject electrical flow monitoring methods involve the use of
fluorescent dyes that are spectrally distinct from fluorescently labeled polynucleotides of primary
interest. These spectrally distinct fluorescent dyes are referred to herein as monitoring dyes. In
a preferred embodiment of the invention, the monitoring dye is selected so as to produce
significant emission when excited by the same excitation source or sources used to excite the
other fluorescent dyes in the composition of interest.

For example, a polynucleotide sequencing reaction product mixture (chain termination
sequencing) may contain (1) four spectrally distinct fluorescent dyes, wherein each of the four
dyes is correlated with a different polynucleotide base (e.g., fluorescently labeled dideoxy
sequencing) and (2) a monitoring dye that is spectrally distinct from the four other dyes.
Movement of the monitoring dye in a separation channel can be used to confirm that current flow
and therefore proper separation of the sequencing reaction products is occurring. Monitoring
dyes may be used in conjunction with sequencing reaction mixtures that employ either more or
less than four dyes.

Another aspect of the invention is methods and compositions for detecting the flow of
electrical current through a separation channel of a fluorescent polynucleotide separation
apparatus. Such methods and compositions are particularly useful with fluorescent
polynucleotide separation apparatus that employ multiple separation channels, e.g., a
multi-capillary or multiple microchannel system, because of the possibility of failure of a subset of
the separation channels. The subject electrical current flow monitoring methods involve the use of
fluorescent dyes that are spectrally distinct from fluorescently labeled polynucleotides of primary
interest. These spectrally distinct fluorescent dyes are referred to herein as monitoring dyes. In
a preferred embodiment of the invention, the monitoring dye is selected so as to produce
significant emission when excited by the same excitation source or sources used to excite the
other fluorescent dyes in the composition of interest.

For example, a polynucleotide sequencing reaction product mixture (chain termination
sequencing) may contain (1) four spectrally distinct fluorescent dyes, wherein each of the four
dyes is correlated with a different polynucleotide base (e.g., fluorescently labeled dideoxy
sequencing) and (2) a monitoring dye that is spectrally distinct from the four other dyes.
Movement of the monitoring dye in a separation channel can be used to confirm that current flow
and therefore proper separation of the sequencing reaction products is occurring. Monitoring dyes
can be used in conjunction with sequencing reaction mixtures that employ either more or less
than four dyes, e.g., one color or two color based sequencing.

Monitoring dyes may also be used in conjunction with other forms of fluorescent
polynucleotide fragment analysis in addition to polynucleotide sequencing. Such other forms
of analysis include analysis of nucleic acid amplification products, ligation products, and the like.

The monitoring dyes may be used by themselves or may be conjugated to other
molecules that can modify the migration rate of the monitoring dyes during electrophoresis, i.e.,
a mobility modifier. Examples of such migration modifying molecules include polynucleotides,
polynucleotide analogs, peptides, polypeptides, the mobility modifying molecules described in
U.S. Patent No. 5,514,543, and the like. Preferably, these mobility modifying molecules are
selected so as to not have spectral properties that interfere with fluorescent detection of the dyes
of interest. Detailed descriptions of how to conjugate fluorescent dyes to various compounds can
be found in, among other places, Hermanson, Bioconjugate Techniques, Academic Press, San
Diego, CA (1996). Unless indicated otherwise by context of usage, the term "monitoring dye"
includes monitoring dye conjugates.

Embodiments of the invention include compositions comprising fluorescently labeled
polynucleotides and one or more monitoring dyes, wherein the monitoring dyes are spectrally
distinct from the other fluorescent dyes in the mixture. The monitoring dyes may be added to
the composition either before, after, or during the formation of the fluorescently labeled
polynucleotides for analysis. For example, a monitoring dye may be added to a polynucleotide
sequencing reaction either before or after the reaction is terminated. In some embodiments of
the invention, the subject compositions comprise multiple different monitoring dyes. In such
embodiments, the monitoring dyes are preferably conjugates having different electrophoretic
mobilities. In other embodiments of the subject compositions, a single signal fluorescent dye
is present, but the dye molecules are conjugated to two or more different mobility modifier
species so as to produce multiple opportunities to detect the monitoring dye during
electrophoretic separation.

The invention also includes methods of detecting the flow of electrical current through
a separation channel of a fluorescent polynucleotide separation apparatus by introducing a
fluorescently labeled polynucleotide composition into a channel of a fluorescent polynucleotide
separation apparatus. The fluorescently labeled polynucleotide composition comprises a
polynucleotide labeled with a first fluorescent dye and a monitoring dye that is spectrally distinct
from the first fluorescent dye. In most embodiments of the invention, the fluorescently labeled
polynucleotide is a complex mixture of different length polynucleotides. Exemplary of such
fluorescently labeled polynucleotide mixtures are the products of DNA sequencing reactions
employing either fluorescently labeled primers or fluorescently labeled terminators, PCR
amplification products formed by using fluorescently labeled primers, fluorescently labeled
minisequencing reaction products, fluorescently labeled oligonucleotide ligation reaction products,
and the like. Such reactions produce genetic information that may be analyzed in the fluorescent
polynucleotide separation apparatus. The monitoring dye is spectrally distinct from the
fluorescent dyes used to label the polynucleotides that convey genetic information. For example,
the invention includes a composition comprising the complex mixture of different fluorescently
labeled polynucleotides produced from four color chain termination sequencing and a signal dye
that is spectrally distinct from the four fluorescent dyes on the different sequencing reaction
products.

After the fluorescently labeled polynucleotide composition is introduced into the
separation channel of a fluorescent polynucleotide separation apparatus, the apparatus is activated
and the polynucleotides (and signal dyes, if not joined to a polynucleotide) are permitted to separate
along the separation channel. The movement of the monitoring dye through the separation
channel may then be detected by the apparatus. Lack of movement of the monitoring dye (or
dyes) or permutations of the movement of the monitoring dyes through the separation channels
may be used to detect problems with the flow of electrical current through the separation channel.
The movement of monitoring dyes in different channels of a multiple channel fluorescent
polynucleotide separation apparatus may be compared with one another so as to facilitate the
detection of problems with current flow.
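
One simple way to compare monitoring dye movement across channels is to flag any channel in which the dye was never detected, or in which it arrived much earlier or later than is typical for the run. The Python sketch below illustrates that idea only; the function name, the use of the median as the "typical" arrival time, and the 20% deviation threshold are hypothetical choices rather than anything prescribed by the specification.

```python
from statistics import median
from typing import Dict, List, Optional


def flag_channels(
    arrival_times: Dict[str, Optional[float]],
    max_relative_deviation: float = 0.2,
) -> List[str]:
    """Return channels whose monitoring dye was not detected or arrived atypically.

    arrival_times maps a channel name to the detection time of the monitoring
    dye, or None if the dye never reached the detector in that channel.
    """
    detected = [t for t in arrival_times.values() if t is not None]
    if not detected:
        return sorted(arrival_times)  # no channel showed dye movement
    typical = median(detected)
    flagged = []
    for channel, t in arrival_times.items():
        if t is None or abs(t - typical) > max_relative_deviation * typical:
            flagged.append(channel)
    return sorted(flagged)
```

Using the median keeps the reference value stable when a minority of channels fail, which matches the situation described above in which most channels have proper current flow.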

Embodiments of the invention also include computer code for using monitoring dyes
to monitor current flow in the subject methods, computer storage media embodying such code,
and programmable electronic computers programmed with such code.

INCORPORATION BY REFERENCE 
All publications, patent applications, and patents referenced in the specification are herein
incorporated by reference to the same extent as if each individual publication or patent
application was specifically and individually indicated to be incorporated by reference.

EQUIVALENTS

All publications, patent applications, and patents mentioned in this specification are
indicative of the level of skill of those skilled in the art to which this invention pertains.
Although only a few embodiments have been described in detail above, those having ordinary
skill in the molecular biology art will clearly understand that many modifications are possible
in the preferred embodiment without departing from the teachings thereof. All such
modifications are intended to be encompassed within the following claims. The foregoing written
specification is considered to be sufficient to enable one skilled in the art to which this invention
pertains to practice the invention. Indeed, various modifications of the above-described modes
for carrying out the invention which are apparent to those skilled in the field of molecular
biology or related fields are intended to be within the scope of the following claims.

1. A method of calibrating a fluorescent polynucleotide separation apparatus, said
method comprising the steps of introducing a fluorescent polynucleotide separation standard into
said apparatus, wherein the standard comprises at least two polynucleotides of different length,
each of the polynucleotides being labeled with a spectrally distinct fluorescent dye, separating
the polynucleotides from each other, detecting the separated polynucleotides with a detector,
wherein the detector collects spectral data from the separated polynucleotides over a plurality of
spectral channels, and collects temporal data from the separated polynucleotides over a plurality
of temporal points, and generating a total emission temporal profile from the spectral and
temporal data.

2. The method according to claim 1, wherein the separation standard comprises four
polynucleotides each of different length and labeled with a spectrally distinct dye.

3. The method of claim 2, wherein the length of the polynucleotides is selected so as
to minimize the spectral overlap of the fluorescent dye at each point of detection for the
polynucleotides.

4. The method of claim 1, wherein the separation standard comprises five
polynucleotides each of different length and labeled with a spectrally distinct dye.

5. The method according to claim 1, further comprising the step of detecting the peaks
in the total emission temporal profile.

6. The method according to claim 5, further comprising the step of selecting a reference
spectrum for each of the fluorescent dyes, wherein each reference spectrum substantially
corresponds to a peak of the emission temporal profile.

7. The method of claim 6, wherein each reference spectrum is corrected by estimating
the net analytical signal for each spectral channel.

8. The method according to claim 4, wherein the peaks are detected by a transformation
according to the equation:

$$S_i = (I_{i+1} - I_i) + (I_{i+2} - I_{i-1})$$

where S_i is the slope (as estimated by the detection transformation) at point i, and I_k
is the intensity of the total emission temporal profile at point k.

9. A system for separating and detecting fluorescently labeled polynucleotides
comprising a fluorescent polynucleotide separation apparatus, and a computer in functional
combination with the fluorescent polynucleotide separation apparatus, wherein the computer is
programmed to produce a total emission temporal profile from a calibration standard comprising
at least two polynucleotides of different length, each of the polynucleotides being labeled with
a spectrally distinct fluorescent dye.

10. The system according to claim 9, wherein the computer is programmed to detect
the peaks in the total emission temporal profile.

11. The system according to claim 10, wherein a reference spectrum for each of the
fluorescent dyes is produced by selecting a reference spectrum that substantially corresponds to
a peak of the emission temporal profile.

12. The system according to claim 11, wherein the computer corrects each reference
spectrum by estimating the net analytical signal for each spectral channel.

13. A system for calibrating a fluorescent polynucleotide separation apparatus, said
system comprising a processor and a computer readable medium functionally coupled to said
processor for storing a computer program comprising:

computer code that receives a plurality of spectral and temporal data from a
fluorescent polynucleotide separation apparatus, and computer code that calculates a
total emission temporal profile from the spectral and temporal data.

14. A calibration standard for a fluorescent polynucleotide separation apparatus, said
standard comprising four polynucleotides of different length, each polynucleotide labeled with
a different fluorescent dye having a distinctive spectral profile having a peak, wherein the length
of each of the polynucleotides is such that the peak of the spectral profile of each dye does not
significantly overlap between the separated fragments.

15. A kit for producing a calibration standard of claim 14, wherein the fluorescent
labeled polynucleotides are stored in separate containers.

16. A method of detecting the flow of electrical current through a separation channel
of a fluorescent polynucleotide separation apparatus, said method comprising the steps of introducing
a fluorescently labeled polynucleotide composition to a channel of a fluorescent polynucleotide
separation apparatus, said composition comprising a polynucleotide labeled with a first
fluorescent dye and a monitoring dye that is spectrally distinct from the fluorescent dye, and
detecting the monitoring dye.

17. The method according to claim 16, wherein the composition comprises a plurality
of polynucleotides labeled with at least two spectrally distinct fluorescent dyes, wherein the
monitoring dye is spectrally distinct from the fluorescent dyes.

18. The method of claim 17, wherein the polynucleotides labeled with at least two
spectrally distinct fluorescent dyes is a polynucleotide sequencing reaction product mixture.

19. The method of claim 18, wherein the monitoring dye is attached to a
polynucleotide.

20. A composition for monitoring the flow of electric current through a separation
channel of a fluorescent polynucleotide separation apparatus, said composition comprising a
fluorescent dye labeled polynucleotide composition, and a monitoring dye that is spectrally
distinct from the fluorescent dyes of the polynucleotide composition.

21. The composition according to claim 20, wherein the fluorescent dye labeled
polynucleotide composition is a polynucleotide sequencing reaction product mixture.